Bézier curve






Bridging Vision, Language, and Mathematics: Pictographic Character Reconstruction with Bézier Curves

Wan, Zihao, Xu, Pau Tong Lin, Luo, Fuwen, Wang, Ziyue, Li, Peng, Liu, Yang

arXiv.org Artificial Intelligence

While Vision-language Models (VLMs) have demonstrated strong semantic capabilities, their ability to interpret the underlying geometric structure of visual information is less explored. Pictographic characters, which combine visual form with symbolic structure, provide an ideal test case for this capability. We formulate this visual recognition challenge in the mathematical domain, where each character is represented by an executable program of geometric primitives. This is framed as a program synthesis task, training a VLM to decompile raster images into programs composed of Bézier curves. Our model, acting as a "visual decompiler", demonstrates performance superior to strong zero-shot baselines, including GPT-4o. The most significant finding is that when trained solely on modern Chinese characters, the model is able to reconstruct ancient Oracle Bone Script in a zero-shot context. This generalization provides strong evidence that the model acquires an abstract and transferable geometric grammar, moving beyond pixel-level pattern recognition to a more structured form of visual understanding.
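As a rough illustration of what "a program composed of Bézier curves" could look like, the sketch below represents a character as a list of strokes, each stroke being a list of control points, and evaluates every stroke with de Casteljau's algorithm. The names `de_casteljau`, `sample_program`, and the toy `program` are hypothetical and chosen for this example; the abstract does not specify the paper's actual program format.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Curve = List[Point]  # control points of one Bézier curve (any degree)


def de_casteljau(control_points: Curve, t: float) -> Point:
    """Evaluate a Bézier curve at parameter t in [0, 1] by repeated linear interpolation."""
    pts = list(control_points)
    while len(pts) > 1:
        pts = [
            ((1 - t) * x0 + t * x1, (1 - t) * y0 + t * y1)
            for (x0, y0), (x1, y1) in zip(pts, pts[1:])
        ]
    return pts[0]


def sample_program(program: List[Curve], samples_per_curve: int = 8) -> List[List[Point]]:
    """Turn a list of curves into polylines by sampling each curve uniformly in t."""
    return [
        [de_casteljau(curve, i / (samples_per_curve - 1)) for i in range(samples_per_curve)]
        for curve in program
    ]


# Hypothetical two-stroke "character": one cubic arch and one quadratic bar.
program = [
    [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0)],
    [(0.0, 0.5), (0.5, 0.5), (1.0, 0.5)],
]
polylines = sample_program(program)
```

Each polyline can then be rasterised and compared against the input image, which is one plausible way to score a reconstructed program.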




Vector sketch animation generation with differentiable motion trajectories

Zhu, Xinding, Yang, Xinye, Zheng, Shuyang, Zhang, Zhexin, Gao, Fei, Huang, Jing, Chen, Jiazhou

arXiv.org Artificial Intelligence

Sketching is a direct and inexpensive means of visual expression. Though image-based sketching has been well studied, video-based sketch animation generation is still very challenging due to the temporal coherence requirement. In this paper, we propose a novel end-to-end automatic generation approach for vector sketch animation. To solve the flickering issue, we introduce a Differentiable Motion Trajectory (DMT) representation that describes the frame-wise movement of stroke control points using differentiable polynomial-based trajectories. DMT enables global semantic gradient propagation across multiple frames, significantly improving the semantic consistency and temporal coherence, and producing high-framerate output. DMT employs a Bernstein basis to balance the sensitivity of polynomial parameters, thus achieving more stable optimization. Instead of implicit fields, we introduce sparse track points for explicit spatial modeling, which improves efficiency and supports long-duration video processing. Evaluations on DAVIS and LVOS datasets demonstrate the superiority of our approach over SOTA methods. Cross-domain validation on 3D models and text-to-video data confirms the robustness and compatibility of our approach.
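To make the idea of a polynomial trajectory in a Bernstein basis concrete, here is a minimal sketch that moves a single stroke control point over normalised time t in [0, 1]. The function names and the example coefficients are assumptions for illustration only; the abstract does not give the DMT parameterisation in detail.

```python
from math import comb
from typing import List, Tuple

Point = Tuple[float, float]


def bernstein_basis(n: int, i: int, t: float) -> float:
    """i-th Bernstein polynomial of degree n, evaluated at t in [0, 1]."""
    return comb(n, i) * t**i * (1 - t) ** (n - i)


def trajectory_position(coeffs: List[Point], t: float) -> Point:
    """Position of one control point at time t, given Bernstein coefficients."""
    n = len(coeffs) - 1
    weights = [bernstein_basis(n, i, t) for i in range(n + 1)]
    x = sum(w * c[0] for w, c in zip(weights, coeffs))
    y = sum(w * c[1] for w, c in zip(weights, coeffs))
    return (x, y)


# Hypothetical degree-3 trajectory for one control point across a clip.
coeffs = [(0.0, 0.0), (1.0, 2.0), (3.0, 2.0), (4.0, 0.0)]
frames = [trajectory_position(coeffs, k / 4) for k in range(5)]
```

Because the Bernstein polynomials are non-negative and sum to one, every coefficient influences the position on a comparable scale, which is consistent with the abstract's remark about balancing the sensitivity of the polynomial parameters.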



Continuous-Time Control Synthesis for Multiple Quadrotors under Signal Temporal Logic Specifications

Yuan, Yating

arXiv.org Artificial Intelligence

Ensuring continuous-time control of multiple quadrotors in constrained environments under signal temporal logic (STL) specifications is challenging due to nonlinear dynamics, safety constraints, and disturbances. This letter proposes a two-stage framework to address this challenge. First, exponentially decaying tracking error bounds are derived with multidimensional geometric control gains obtained via differential evolution. These bounds are less conservative, while the resulting tracking errors exhibit smaller oscillations and improved transient performance. Second, leveraging the time-varying bounds, a mixed-integer convex programming (MICP) formulation generates piecewise Bézier reference trajectories that satisfy STL and velocity limits, while ensuring inter-agent safety through convex-hull properties. Simulation results demonstrate that the proposed approach enables formally verifiable multi-agent coordination in constrained environments, with provable tracking guarantees under bounded disturbances.

I. INTRODUCTION

As drone technology progresses, quadrotors are increasingly required to execute complex tasks in confined environments, particularly narrow passages and strict terminal zones [1]. In this context, signal temporal logic (STL) offers a formal language to define tasks over continuous signals with explicit time semantics.
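The convex-hull property mentioned in the abstract says that a Bézier curve never leaves the convex hull of its control points. The sketch below uses a deliberately simplified consequence of that fact: if the axis-aligned bounding boxes of two agents' control points are at least `margin` apart along some axis, the two curve segments are at least `margin` apart everywhere. This is a conservative sufficient check written for illustration, not the paper's MICP formulation, and the names `control_box` and `boxes_separated` are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def control_box(control_points: List[Point]) -> Box:
    """Axis-aligned box around the control points; it contains their convex hull."""
    xs = [p[0] for p in control_points]
    ys = [p[1] for p in control_points]
    return (min(xs), min(ys), max(xs), max(ys))


def boxes_separated(a: List[Point], b: List[Point], margin: float) -> bool:
    """True if the two boxes are at least `margin` apart along the x or y axis.

    Since each curve lies inside its control-point box, True implies the
    curves themselves keep at least `margin` distance. False is inconclusive.
    """
    ax0, ay0, ax1, ay1 = control_box(a)
    bx0, by0, bx1, by1 = control_box(b)
    gap_x = max(bx0 - ax1, ax0 - bx1)  # negative when the x-ranges overlap
    gap_y = max(by0 - ay1, ay0 - by1)  # negative when the y-ranges overlap
    return max(gap_x, gap_y) >= margin


# Hypothetical control points for two agents over the same time interval.
agent_a = [(0.0, 0.0), (1.0, 2.0), (2.0, 0.0)]
agent_b = [(3.0, 0.0), (4.0, 2.0), (5.0, 0.0)]
safe = boxes_separated(agent_a, agent_b, margin=0.5)
```

The appeal of this style of constraint is that it only involves the control points, so it stays linear in the decision variables of an optimisation problem.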